Skip to content

Conversation

@mcbarton
Copy link
Collaborator

Description

Please include a summary of changes, motivation and context for this PR.

This PR will remove the need for the cross compile patch when doing an Emscripten build of llvm.

Fixes # (issue)

Type of change

Please tick all options which are relevant.

  • Bug fix
  • New feature
  • Requires documentation updates

Testing

Please describe the test(s) that you added and ran to verify your changes.

Checklist

  • I have read the contribution guide recently

@mcbarton
Copy link
Collaborator Author

I will update the documentation once I see the ci pass. This is the method I use when avoiding the cross compile patch locally. Not sure every option added to the emscripten configure (emcmake part) is needed. I just took them from this Github issue in llvm llvm/llvm-project#60784

@codecov
Copy link

codecov bot commented May 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.36%. Comparing base (6951212) to head (9c54f99).
Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #583   +/-   ##
=======================================
  Coverage   76.36%   76.36%           
=======================================
  Files           9        9           
  Lines        3655     3655           
=======================================
  Hits         2791     2791           
  Misses        864      864           
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mcbarton mcbarton force-pushed the remove-need-to-cross-compile-patch-unix branch from 8124896 to 8b6933d Compare May 20, 2025 15:25
@anutosh491
Copy link
Collaborator

Hi, I think this is good but as the llvm build is at the heart of everything, I'll quickly verify this build locally and get back !

@anutosh491
Copy link
Collaborator

anutosh491 commented May 21, 2025

Hey @kr-2003, could you please help me verify this as I'm short on time.

References

  1. Back and forth between me and Serge (core clang maintainer) from Dec 2023
So I've made enough progress and I currently have enough understanding of what's
happening to share it with you. I can't invest more time on this than what I've
already be doing, hopefully this is going to put you on track.

= Why do we have this -isystem /usr/include?

emcmake passes a toolchain file (cmake/Modules/Platform/Emscripten.cmake) to
cmake that informs it about the toolchain. In the process it sets
CMAKE_CROSSCOMPILING=ON.

During llvm build process, if CMAKE_CROSSCOMPILING=ON, then a sub build of llvm
happens, a "native" build, to build the native tools needed to bootstrap the
compiler. Mostly llvm-tblgen and clang-tblgen, but they are not the only ones.

the configure step for this "native" build is confused by all the environment
variables set by emcmake, which leads to invalid decision. In particular it does
detect some zlib.h and zstd.h headers in the system (in /usr/include) and tries
to use them. But has it still using em++, or maybe because of some env variable,
this native build keeps on trying to generate wasm (!).

So one way to get rid of these extra header is to force the  "native build" not
to look for zlib or zstd. The following patch does so (it also replaces em++ by
clang++ to prevent some of the side effects mentionned above):

patch -p1 << EOF

--- ./llvm/cmake/modules/CrossCompile.cmake    2023-12-14 10:22:34.750923364
0100
+++ ./llvm/cmake/modules/CrossCompile.cmake2   2023-12-14 10:59:55.217196320
0100
@@ -70,8 +70,10 @@
  add_custom_command(OUTPUT
  \${\${project_name}_\${target_name}_BUILD}/CMakeCache.txt
    COMMAND \${CMAKE_COMMAND} -G "\${CMAKE_GENERATOR}"
        -DCMAKE_MAKE_PROGRAM="\${CMAKE_MAKE_PROGRAM}"
-        -DCMAKE_C_COMPILER_LAUNCHER="\${CMAKE_C_COMPILER_LAUNCHER}"
-        -DCMAKE_CXX_COMPILER_LAUNCHER="\${CMAKE_CXX_COMPILER_LAUNCHER}"
+        -DCMAKE_C_COMPILER="clang"
+        -DCMAKE_CXX_COMPILER="clang++"
+        -DLLVM_ENABLE_ZLIB="\${LLVM_ENABLE_ZLIB}"
+        -DLLVM_ENABLE_ZSTD="\${LLVM_ENABLE_ZSTD}"
        \${CROSS_TOOLCHAIN_FLAGS_\${target_name}} \${CMAKE_CURRENT_SOURCE_DIR}
        \${CROSS_TOOLCHAIN_FLAGS_\${project_name}_\${target_name}}
EOF

while passing

   -DLLVM_ENABLE_ZSTD=OFF                          \
   -DLLVM_ENABLE_ZLIB=OFF                          \

If I understand correctly, forcing the compiler to be clang should be enough, no
need to hack LLVM_ENABLE_ZSTD, I'm just sharing the different steps I've been
exploring.

It is however, not enough, but certainly a track to follow.

Another approach that led me to a final clang.js (\o/) is to skip the "native
build" step. This can be done either through modification of the
cmake/Modules/Platform/Emscripten.cmake file or (and that's what I did) a small
patch:

patch -p1 << EOF

--- llvm/CMakeLists.txt	2023-12-14 14:32:34.608021529 +0100
+++ llvm/CMakeLists.txt2	2023-12-14 14:32:25.722915506 +0100
@@ -762,6 +762,7 @@
  message(FATAL_ERROR "Cannot enable BUILD_SHARED_LIBS with LLVM_LINK_LLVM_DYLIB.  We recommend disabling BUILD_SHARED_LIBS.")
endif()

+set(CMAKE_CROSSCOMPILING OFF)
option(LLVM_OPTIMIZED_TABLEGEN "Force TableGen to be built with optimization" OFF)
if(CMAKE_CROSSCOMPILING OR (LLVM_OPTIMIZED_TABLEGEN AND (LLVM_ENABLE_ASSERTIONS OR CMAKE_CONFIGURATION_TYPES)))
  set(LLVM_USE_HOST_TOOLS ON)
EOF

if you do so, you need to provide your own version of the bootstrapped tool. I
did this by recompiling llvm locally to get llvm-tblgen and clang-tblgen, then
passed

   -DLLVM_TABLEGEN=$HOME/sources/llvm-project/_build/bin/llvm-tblgen \
   -DCLANG_TABLEGEN=$HOME/sources/llvm-project/_build/bin/clang-tblgen

with these changes I've been able to build clang.js, but not the install step,
as clang-ast-dump is another missing tool that needs to be compiled natively and
passed to LLVM, but there's no variable to do so. It would be easy to craft a
patch to do so though.

Again, sorry for not "finishing" the job, but I've used more than the time
credit I wanted to allocate to this. Hopefully this long wall of text will help you.

Serge

Serge and I discussed 2 approaches, we went for approach 1 back then but it seems approach 2 is what Matthew is trying and we need to verify it.

  1. https://github.com/soedirgo/llvm-wasm
  2. https://github.com/YoWASP/clang/blob/0e0259397c85ba52b7370f6be0f52ae08ba3f4d5/build.sh#L91-L98

Verify and let me know if this is good to go !

Once sure of what needs to be done, please update the same changes on emscripten-forge's llvm recipe !

@vgvassilev
Copy link
Contributor

If this passes the ci we are good right?

@anutosh491
Copy link
Collaborator

If this passes the ci we are good right?

Hopefully but as Matthew writes "Not sure every option added to the emscripten configure (emcmake part) is needed"

let's make sure one of us can have a clean build too !

@anutosh491
Copy link
Collaborator

Hi, I am a bit confused here @mcbarton

Was looking into this, but I think the llvm build is being restored through cache isn't it ?

Where can I come across the fresh one involving the changes above ?

@mcbarton
Copy link
Collaborator Author

@anutosh491 I cleared the emscripten builds from the cache before creating this PR. You can see this PR create its cache builds here https://github.com/compiler-research/CppInterOp/actions/runs/15141592541 . The reason the cache builds for this PR no longer exists, is because I deleted them after the ci passed for the latest commit

-DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER \
-DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=ONLY \
-DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY \
-DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=ONLY \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

Have been looking into this. I don't think we should technically need any cmake_XX_YY command here. It's only the *-tblgen executable dirs we need to provide, hence I think only having DLLVM_NATIVE_TOOL_DIR should do the job isn't it ?

Copy link
Collaborator

@anutosh491 anutosh491 May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think should be good to go apart from this. Maybe try removing these flags and building ?

Copy link
Collaborator Author

@mcbarton mcbarton May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using only LLVM_NATIVE_TOOL_DIR worked locally for me. I have cleared the cache of Emscripten builds, squashed the commits. Once the ci passes we should be good to merge, once the cache builds for this branch have been deleted (you can leave this to me if you would like).

One thing to note is that we still seem to build some native stuff as part of the Emscripten build even with this patch removed. If you look at the build you'll see the following Native directory still being built in the Emscripten build part

lib/DebugInfo/PDB/CMakeFiles/LLVMDebugInfoPDB.dir/Native/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh I think that might just be a side effect of some flag being on by default

once the cache builds for this branch have been deleted (you can leave this to me if you would like).

Cool, i'll just have a look once done, approve and you can take over then

@mcbarton mcbarton force-pushed the remove-need-to-cross-compile-patch-unix branch from 973d977 to 5424013 Compare May 21, 2025 12:26
@mcbarton mcbarton force-pushed the remove-need-to-cross-compile-patch-unix branch from 5424013 to 9c54f99 Compare May 21, 2025 12:42
@mcbarton mcbarton requested a review from anutosh491 May 21, 2025 14:48
@mcbarton mcbarton mentioned this pull request May 21, 2025
4 tasks
@mcbarton
Copy link
Collaborator Author

@anutosh491 can you approve this PR now please? I'm hoping to merge this and the cling 1.2 support PR in tonight when things are quiet, since both are likely to have a disruptive effect while the cache rebuilds.

Copy link
Collaborator

@anutosh491 anutosh491 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay to me
Thank you

@vgvassilev vgvassilev merged commit 192da21 into compiler-research:main May 21, 2025
73 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants